Photo credit: Division of Aquatic Resources

Overview:

This study looks at how habitat complexity influences fish abundance in West Hawaii on Hawaii Island. All data came from the Division of Aquatic Resources' West Hawaii Aquarium Project. I investigated this relationship at 9 sites surveyed during 2018. I found that herbivore abundance was influenced by slope but not by surface area. Piscivore and corallivore abundances were not influenced by either habitat complexity metric. I did find that corallivore abundance was influenced by herbivore abundance. Knowing how habitat complexity influences fish abundance is important when making conservation decisions, particularly when establishing Marine Protected Areas. Modeling these parameters over several years will give a better understanding of these interactions.

Introduction:

Habitat complexity has been found to have a direct effect on reef fish assemblages (Friedlander et al. 2003; Wedding et al. 2008; Burns et al. 2015). More complex benthic structure leads to larger fish populations. This is important information for many management decisions, particularly when deciding where Marine Protected Areas (MPAs) should be created (Friedlander et al. 2003). The data used here were collected by the Division of Aquatic Resources (DAR) as a part of their West Hawaii Aquarium Project (WHAP) investigating the impacts of aquarium fish collecting and the effectiveness of MPAs in Hawaii. The study area consisted of 9 sites spanning West Hawaii on Hawaii Island.

With these data, I will determine if habitat complexity influenced fish assemblages at these 9 West Hawaii sites during 2018. From there, I will investigate how different groups of reef fishes, particularly herbivores, piscivores, and corallivores, were individually impacted by complexity. Lastly, I will explore how these different groups of fishes influenced each other's abundance.

Specific questions I will answer:

  1. Are herbivores, corallivores, and/or piscivores influenced by habitat complexity?

  2. Does the abundance of herbivorous fishes affect the abundance of corallivorous fishes?

Initial Data Analysis:

# fish abundance dataframe:
a <-read.csv("data/abundance.csv", header = TRUE, sep = ",")

# habitat complexity dataframe:
c <-read.csv("data/complexity.csv", header = TRUE, sep = ",")

Based on the tables below:

head(a)
##   ï..ID SITE      LAT      LONG Corallivore Herbivore Piscivore
## 1     1 H_LA 20.16000 -155.9002     0.07000  0.656875  0.019375
## 2     3 H_WK 20.07392 -155.8645     0.06125  0.958125  0.033125
## 3     4 H_PU 19.96988 -155.8488     0.02375  1.229375  0.045625
## 4     7 H_KL 19.84395 -155.9810     0.05625  1.471250  0.023125
## 5    11 H_HH 19.67098 -156.0303     0.11250  1.985625  0.046875
## 6    15 H_NK 19.56838 -155.9693     0.04000  0.856250  0.013750
str(a)
## 'data.frame':    10 obs. of  7 variables:
##  $ ï..ID      : int  1 3 4 7 11 15 20 21 23 24
##  $ SITE       : Factor w/ 10 levels "H_HH","H_KA",..: 5 10 9 4 1 7 3 2 8 6
##  $ LAT        : num  20.2 20.1 20 19.8 19.7 ...
##  $ LONG       : num  -156 -156 -156 -156 -156 ...
##  $ Corallivore: num  0.07 0.0612 0.0238 0.0563 0.1125 ...
##  $ Herbivore  : num  0.657 0.958 1.229 1.471 1.986 ...
##  $ Piscivore  : num  0.0194 0.0331 0.0456 0.0231 0.0469 ...
head(c)
##   ï..ID surfAaverage slopeaverage
## 1     1     15.01382     29.79281
## 2     3     19.20849     41.84790
## 3     4     21.22624     43.28495
## 4     7     17.06109     38.93240
## 5    15     21.28033     44.59651
## 6    20     20.21769     44.08648
str(c)
## 'data.frame':    9 obs. of  3 variables:
##  $ ï..ID       : int  1 3 4 7 15 20 21 23 24
##  $ surfAaverage: num  15 19.2 21.2 17.1 21.3 ...
##  $ slopeaverage: num  29.8 41.8 43.3 38.9 44.6 ...

We can see that the fish abundance dataframe has 7 variables with 10 different observations and the complexity dataframe has 3 variables with 9 observations.
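The row counts differ by one, so it is worth finding out which site is missing before joining. A minimal sketch in base R, with the Local IDs typed in by hand from the `str()` output above (the vector names are illustrative and not part of the original analysis):

```r
# Local IDs copied from the str() output of the two dataframes
abundance_ids  <- c(1, 3, 4, 7, 11, 15, 20, 21, 23, 24)
complexity_ids <- c(1, 3, 4, 7, 15, 20, 21, 23, 24)

# IDs present in one table but not the other
missing_from_complexity <- setdiff(abundance_ids, complexity_ids)
missing_from_abundance  <- setdiff(complexity_ids, abundance_ids)

missing_from_complexity
missing_from_abundance
```

With these IDs, the only site lacking complexity measurements is Local ID 11 (H_HH).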

These data were collected by the DAR for their annual WHAP project. All data viewed here are from the year 2018. There is metadata to explain the abbreviations for site names and units for all measurements.

Issues so far:

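The column for “Local ID” shows up as “ï..ID”…weird. This usually means the CSV starts with a UTF-8 byte-order mark (BOM) that gets read as part of the first column name, so I'll rename the column.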

# fix for abundance dataframe:
colnames(a)
## [1] "ï..ID"       "SITE"        "LAT"         "LONG"        "Corallivore"
## [6] "Herbivore"   "Piscivore"
names(a)[names(a) == "ï..ID" ] <- "LocalID"
colnames(a)
## [1] "LocalID"     "SITE"        "LAT"         "LONG"        "Corallivore"
## [6] "Herbivore"   "Piscivore"
# fix for complexity dataframe:
colnames(c)
## [1] "ï..ID"        "surfAaverage" "slopeaverage"
names(c)[names(c) == "ï..ID" ] <- "LocalID"
colnames(c)
## [1] "LocalID"      "surfAaverage" "slopeaverage"
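An alternative to renaming after the fact is to tell `read.csv()` about the byte-order mark when reading the file. The sketch below assumes the stray characters really do come from a UTF-8 BOM, which I have not confirmed for these files. It writes a small throwaway CSV with a BOM and reads it back with `fileEncoding = "UTF-8-BOM"`:

```r
# Write a tiny CSV that starts with a UTF-8 byte-order mark (bytes EF BB BF).
# The file and its contents are made up for illustration.
tmp <- tempfile(fileext = ".csv")
bom <- as.raw(c(0xEF, 0xBB, 0xBF))
writeBin(c(bom, charToRaw("ID,SITE\n1,H_LA\n3,H_WK\n")), tmp)

# "UTF-8-BOM" tells R to drop the mark instead of folding it into the header
clean <- read.csv(tmp, fileEncoding = "UTF-8-BOM")
names(clean)

unlink(tmp)
```

If this applies to the real files, the same argument could be added to the two `read.csv()` calls at the top, and the renaming step would no longer be needed.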

All fixed! Now onto…

Tidy Data:

All data are now tidy and the two dataframes are ready to be joined!

First, let’s look for primary keys:

library(dplyr) # provides %>%, count(), filter(), and the joins used below
# abundance dataframe:
a %>%
count(LocalID) %>%
filter(n >1)
## # A tibble: 0 x 2
## # ... with 2 variables: LocalID <int>, n <int>
c %>%
count(LocalID) %>%
filter(n >1)
## # A tibble: 0 x 2
## # ... with 2 variables: LocalID <int>, n <int>

Local ID is a primary key (what I can join tables by) for both abundance and complexity dataframes.
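The same check can be wrapped in a small helper so it is easy to rerun after every join. This is a sketch in base R; `is_primary_key()` is a hypothetical helper of my own, not a function from dplyr or any other package:

```r
# Hypothetical helper: TRUE when `key` has no missing values and no repeats,
# i.e. it uniquely identifies every row of `df`.
is_primary_key <- function(df, key) {
  !anyNA(df[[key]]) && anyDuplicated(df[[key]]) == 0
}

# Local IDs copied from the two dataframes above
abundance_ids  <- data.frame(LocalID = c(1, 3, 4, 7, 11, 15, 20, 21, 23, 24))
complexity_ids <- data.frame(LocalID = c(1, 3, 4, 7, 15, 20, 21, 23, 24))

key_ok_abundance  <- is_primary_key(abundance_ids, "LocalID")
key_ok_complexity <- is_primary_key(complexity_ids, "LocalID")

# A table with a repeated ID fails the check
key_ok_duplicated <- is_primary_key(data.frame(LocalID = c(1, 3, 3)), "LocalID")
```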

Now we can join!

library(knitr)
library(kableExtra)
new.dat <- full_join(a,c, by = "LocalID", copy = FALSE, suffix = c(".a", ".c"))

new.dat %>%
  kable() %>%
  kable_styling()
LocalID  SITE  LAT       LONG       Corallivore  Herbivore  Piscivore  surfAaverage  slopeaverage
1        H_LA  20.16000  -155.9002  0.070000     0.656875   0.019375   15.01382      29.79281
3        H_WK  20.07392  -155.8645  0.061250     0.958125   0.033125   19.20849      41.84790
4        H_PU  19.96988  -155.8488  0.023750     1.229375   0.045625   21.22624      43.28495
7        H_KL  19.84395  -155.9810  0.056250     1.471250   0.023125   17.06109      38.93240
11       H_HH  19.67098  -156.0303  0.112500     1.985625   0.046875   NA            NA
15       H_NK  19.56838  -155.9693  0.040000     0.856250   0.013750   21.28033      44.59651
20       H_KE  19.46282  -155.9268  0.060625     1.238125   0.036875   20.21769      44.08648
21       H_KA  19.36915  -155.8974  0.138125     1.973125   0.023125   22.10405      48.23216
23       H_OM  19.16730  -155.9133  0.082500     1.615000   0.039375   18.98956      42.83859
24       H_MA  19.07672  -155.9040  0.128750     1.768125   0.036250   20.32958      46.37959
new.dat %>%
count(LocalID) %>%
filter(n >1)
## # A tibble: 0 x 2
## # ... with 2 variables: LocalID <int>, n <int>

Local ID is still a primary key!

left.join <- a %>% left_join(c)
## Joining, by = "LocalID"
left.join %>%
  kable() %>%
  kable_styling()
LocalID  SITE  LAT       LONG       Corallivore  Herbivore  Piscivore  surfAaverage  slopeaverage
1        H_LA  20.16000  -155.9002  0.070000     0.656875   0.019375   15.01382      29.79281
3        H_WK  20.07392  -155.8645  0.061250     0.958125   0.033125   19.20849      41.84790
4        H_PU  19.96988  -155.8488  0.023750     1.229375   0.045625   21.22624      43.28495
7        H_KL  19.84395  -155.9810  0.056250     1.471250   0.023125   17.06109      38.93240
11       H_HH  19.67098  -156.0303  0.112500     1.985625   0.046875   NA            NA
15       H_NK  19.56838  -155.9693  0.040000     0.856250   0.013750   21.28033      44.59651
20       H_KE  19.46282  -155.9268  0.060625     1.238125   0.036875   20.21769      44.08648
21       H_KA  19.36915  -155.8974  0.138125     1.973125   0.023125   22.10405      48.23216
23       H_OM  19.16730  -155.9133  0.082500     1.615000   0.039375   18.98956      42.83859
24       H_MA  19.07672  -155.9040  0.128750     1.768125   0.036250   20.32958      46.37959
semi.join <- a %>% semi_join(c)
## Joining, by = "LocalID"
semi.join %>%
  kable() %>%
  kable_styling()
LocalID  SITE  LAT       LONG       Corallivore  Herbivore  Piscivore
1        H_LA  20.16000  -155.9002  0.070000     0.656875   0.019375
3        H_WK  20.07392  -155.8645  0.061250     0.958125   0.033125
4        H_PU  19.96988  -155.8488  0.023750     1.229375   0.045625
7        H_KL  19.84395  -155.9810  0.056250     1.471250   0.023125
15       H_NK  19.56838  -155.9693  0.040000     0.856250   0.013750
20       H_KE  19.46282  -155.9268  0.060625     1.238125   0.036875
21       H_KA  19.36915  -155.8974  0.138125     1.973125   0.023125
23       H_OM  19.16730  -155.9133  0.082500     1.615000   0.039375
24       H_MA  19.07672  -155.9040  0.128750     1.768125   0.036250

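Semi join isn’t a good fit. It is a filtering join: it keeps only the rows of the abundance dataframe that have a match in the complexity dataframe, so Local ID 11 (which has no row in the complexity dataframe) is dropped, and it never adds any of the complexity columns to the result. I’m going to stick with the left join, which here gives the same table as the full join stored in new.dat.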
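To make the difference between these joins concrete, here is a small sketch using three rows copied from the tables above. The object names (`abund`, `complx`, and so on) are my own, and `anti_join()` is included only because it is a quick way to list the sites that have no complexity data:

```r
library(dplyr)

# Three sites from the abundance table; only two have complexity data
abund  <- data.frame(LocalID   = c(7L, 11L, 15L),
                     Herbivore = c(1.471250, 1.985625, 0.856250))
complx <- data.frame(LocalID      = c(7L, 15L),
                     slopeaverage = c(38.93240, 44.59651))

# Mutating join: keeps every abundance row and adds the complexity column
left <- left_join(abund, complx, by = "LocalID")

# Filtering joins: only rows of `abund` come back, with no columns from `complx`
semi <- semi_join(abund, complx, by = "LocalID")  # rows with a match
anti <- anti_join(abund, complx, by = "LocalID")  # rows without a match
```

Here `left` keeps Local ID 11 with `NA` for slope, `semi` drops it, and `anti` returns only that row.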

Imputations:

library(mice)
library(VIM)
new.dat_plot<-aggr(new.dat, col=c('deep sky blue','gold'), numbers=TRUE, sortVars=TRUE,
labels=names(new.dat), cex.axis=.7, gap=3, ylab=c("Missing data","Pattern"))

## 
##  Variables sorted by number of missings: 
##      Variable Count
##  surfAaverage   0.1
##  slopeaverage   0.1
##       LocalID   0.0
##          SITE   0.0
##           LAT   0.0
##          LONG   0.0
##   Corallivore   0.0
##     Herbivore   0.0
##     Piscivore   0.0

This graph shows me where the missing values are. Now to impute!

new.dat_imp <-mice(new.dat, m = 5, maxit = 50, method = 'pmm', seed = 500)
## 
##  iter imp variable
##   1   1  surfAaverage*  slopeaverage*
##   1   2  surfAaverage*  slopeaverage*
##   1   3  surfAaverage*  slopeaverage*
##   1   4  surfAaverage*  slopeaverage*
##   1   5  surfAaverage*  slopeaverage*
##   ...  (output for iterations 2 through 49 omitted; every line repeats the pattern above)
##   50   1  surfAaverage*  slopeaverage*
##   50   2  surfAaverage*  slopeaverage*
##   50   3  surfAaverage*  slopeaverage*
##   50   4  surfAaverage*  slopeaverage*
##   50   5  surfAaverage*  slopeaverage*
##  * Please inspect the loggedEvents
new.dat_imp$imp$surfAaverage
##          1        2        3        4        5
## 5 18.98956 15.01382 20.21769 20.21769 20.21769
new.dat_imp$imp$slopeaverage
##          1        2        3        4        5
## 5 44.59651 44.08648 44.59651 42.83859 44.59651
imputed.data <-complete(new.dat_imp, 3)

imputed.data %>%
  kable() %>%
  kable_styling()
LocalID  SITE  LAT       LONG       Corallivore  Herbivore  Piscivore  surfAaverage  slopeaverage
1        H_LA  20.16000  -155.9002  0.070000     0.656875   0.019375   15.01382      29.79281
3        H_WK  20.07392  -155.8645  0.061250     0.958125   0.033125   19.20849      41.84790
4        H_PU  19.96988  -155.8488  0.023750     1.229375   0.045625   21.22624      43.28495
7        H_KL  19.84395  -155.9810  0.056250     1.471250   0.023125   17.06109      38.93240
11       H_HH  19.67098  -156.0303  0.112500     1.985625   0.046875   20.21769      44.59651
15       H_NK  19.56838  -155.9693  0.040000     0.856250   0.013750   21.28033      44.59651
20       H_KE  19.46282  -155.9268  0.060625     1.238125   0.036875   20.21769      44.08648
21       H_KA  19.36915  -155.8974  0.138125     1.973125   0.023125   22.10405      48.23216
23       H_OM  19.16730  -155.9133  0.082500     1.615000   0.039375   18.98956      42.83859
24       H_MA  19.07672  -155.9040  0.128750     1.768125   0.036250   20.32958      46.37959

My missing values were from the complexity dataset. Neither variable (surface area or slope) was recorded for Local ID 11. While I can impute those values, it doesn’t make sense to do so here: predictive mean matching fills the gap by borrowing values observed at completely different sites, and with only 9 sites to borrow from, the imputed values say little about the actual habitat at Local ID 11. I’d rather omit Local ID 11 from my data and look at how complexity influences fish abundance across the other sites.
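A quick way to see what the imputation did is to check whether each candidate value is simply a copy of a value observed at another site. The sketch below uses only numbers printed earlier in this document, and the vector names are illustrative:

```r
# Observed complexity values for the 9 sites that have them
sites          <- c("H_LA", "H_WK", "H_PU", "H_KL", "H_NK", "H_KE", "H_KA", "H_OM", "H_MA")
observed_sa    <- c(15.01382, 19.20849, 21.22624, 17.06109, 21.28033,
                    20.21769, 22.10405, 18.98956, 20.32958)
observed_slope <- c(29.79281, 41.84790, 43.28495, 38.93240, 44.59651,
                    44.08648, 48.23216, 42.83859, 46.37959)

# The five candidate values mice produced for Local ID 11
imputed_sa    <- c(18.98956, 15.01382, 20.21769, 20.21769, 20.21769)
imputed_slope <- c(44.59651, 44.08648, 44.59651, 42.83859, 44.59651)

# Every candidate is an exact copy of some other site's measurement
all_sa_copied    <- all(imputed_sa %in% observed_sa)
all_slope_copied <- all(imputed_slope %in% observed_slope)

# Which site each candidate was borrowed from
sa_donor    <- sites[match(imputed_sa, observed_sa)]
slope_donor <- sites[match(imputed_slope, observed_slope)]
```

For the third imputed dataset (the one passed to `complete()` above), the surface area matches H_KE and the slope matches H_NK, so the filled-in row mixes measurements from two different sites.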

Removing Local ID 11 from the dataframe:

# drop Local ID 11 by its ID rather than by row position
fish.complex <- new.dat[new.dat$LocalID != 11, ]

New dataframe (surfAaverage and slopeaverage now appear as SA and Slope; the renaming code is not shown):

## [1] "LocalID"     "SITE"        "LAT"         "LONG"        "Corallivore"
## [6] "Herbivore"   "Piscivore"   "SA"          "Slope"
fish.complex %>%
  kable() %>%
  kable_styling()
    LocalID  SITE  LAT       LONG       Corallivore  Herbivore  Piscivore  SA        Slope
1   1        H_LA  20.16000  -155.9002  0.070000     0.656875   0.019375   15.01382  29.79281
2   3        H_WK  20.07392  -155.8645  0.061250     0.958125   0.033125   19.20849  41.84790
3   4        H_PU  19.96988  -155.8488  0.023750     1.229375   0.045625   21.22624  43.28495
4   7        H_KL  19.84395  -155.9810  0.056250     1.471250   0.023125   17.06109  38.93240
6   15       H_NK  19.56838  -155.9693  0.040000     0.856250   0.013750   21.28033  44.59651
7   20       H_KE  19.46282  -155.9268  0.060625     1.238125   0.036875   20.21769  44.08648
8   21       H_KA  19.36915  -155.8974  0.138125     1.973125   0.023125   22.10405  48.23216
9   23       H_OM  19.16730  -155.9133  0.082500     1.615000   0.039375   18.98956  42.83859
10  24       H_MA  19.07672  -155.9040  0.128750     1.768125   0.036250   20.32958  46.37959
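The step that turns `surfAaverage` and `slopeaverage` into `SA` and `Slope` does not appear above. One way it could be done, together with dropping the site by its ID, is sketched here on a three-row toy dataframe. This is an assumption about how the renaming was done, not the original code:

```r
# Toy stand-in for the joined dataframe (values copied from the tables above)
toy <- data.frame(LocalID      = c(7, 11, 15),
                  surfAaverage = c(17.06109, NA, 21.28033),
                  slopeaverage = c(38.93240, NA, 44.59651))

# Drop the site with no complexity data, selecting by ID rather than row number
kept <- toy[toy$LocalID != 11, ]

# Rename the complexity columns to the shorter names used in the models
names(kept)[names(kept) == "surfAaverage"] <- "SA"
names(kept)[names(kept) == "slopeaverage"] <- "Slope"

names(kept)
```

Selecting by ID keeps working if the row order changes, which a positional index would not.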

Model:

Let’s make some models!

library(tidyverse)
plot(fish.complex, col = "blue")

herb.model <- lm(Herbivore ~ SA + Slope, data = fish.complex)
summary(herb.model)
## 
## Call:
## lm(formula = Herbivore ~ SA + Slope, data = fish.complex)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.36391 -0.17255  0.05443  0.12777  0.40219 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)  
## (Intercept) -0.25361    0.91076  -0.278   0.7900  
## SA          -0.25311    0.12951  -1.954   0.0985 .
## Slope        0.15383    0.05444   2.825   0.0301 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.2952 on 6 degrees of freedom
## Multiple R-squared:  0.6598, Adjusted R-squared:  0.5463 
## F-statistic: 5.817 on 2 and 6 DF,  p-value: 0.03939
coral.model <- lm(Corallivore ~ SA + Slope, data = fish.complex)
summary(coral.model)
## 
## Call:
## lm(formula = Corallivore ~ SA + Slope, data = fish.complex)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.03283 -0.02153 -0.01416  0.02832  0.05493 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)
## (Intercept)  0.029158   0.111553   0.261    0.803
## SA          -0.020875   0.015863  -1.316    0.236
## Slope        0.010687   0.006668   1.603    0.160
## 
## Residual standard error: 0.03615 on 6 degrees of freedom
## Multiple R-squared:  0.3199, Adjusted R-squared:  0.09325 
## F-statistic: 1.411 on 2 and 6 DF,  p-value: 0.3145
pisc.model <- lm(Piscivore ~ SA + Slope, data = fish.complex)
summary(pisc.model)
## 
## Call:
## lm(formula = Piscivore ~ SA + Slope, data = fish.complex)
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.017070 -0.005871  0.003139  0.005705  0.015922 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)
## (Intercept)  0.0074666  0.0360135   0.207    0.843
## SA          -0.0007528  0.0051211  -0.147    0.888
## Slope        0.0008829  0.0021528   0.410    0.696
## 
## Residual standard error: 0.01167 on 6 degrees of freedom
## Multiple R-squared:  0.09129,    Adjusted R-squared:  -0.2116 
## F-statistic: 0.3014 on 2 and 6 DF,  p-value: 0.7504
fish.model <- lm(Corallivore ~ Herbivore, data = fish.complex)
summary(fish.model)
## 
## Call:
## lm(formula = Corallivore ~ Herbivore, data = fish.complex)
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.044907 -0.009965 -0.005621  0.023550  0.036688 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)  
## (Intercept) -0.007242   0.031498  -0.230   0.8247  
## Herbivore    0.061738   0.022973   2.687   0.0312 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.02848 on 7 degrees of freedom
## Multiple R-squared:  0.5078, Adjusted R-squared:  0.4375 
## F-statistic: 7.222 on 1 and 7 DF,  p-value: 0.0312
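Rather than reading numbers off the printed summaries, the estimates, p-values, and adjusted R-squared can be pulled out of the fitted model directly. The sketch below retypes the herbivore and corallivore columns from the fish.complex table so that it runs on its own. The object names are illustrative:

```r
# Herbivore and corallivore abundance for the 9 sites, copied from fish.complex
fish.sketch <- data.frame(
  Herbivore   = c(0.656875, 0.958125, 1.229375, 1.471250, 0.856250,
                  1.238125, 1.973125, 1.615000, 1.768125),
  Corallivore = c(0.070000, 0.061250, 0.023750, 0.056250, 0.040000,
                  0.060625, 0.138125, 0.082500, 0.128750)
)

fit <- lm(Corallivore ~ Herbivore, data = fish.sketch)

# Coefficient table as a matrix: Estimate, Std. Error, t value, Pr(>|t|)
coefs     <- summary(fit)$coefficients
slope_est <- coefs["Herbivore", "Estimate"]
slope_p   <- coefs["Herbivore", "Pr(>|t|)"]
adj_r2    <- summary(fit)$adj.r.squared

round(c(estimate = slope_est, p = slope_p, adj.r2 = adj_r2), 4)
```

The same pattern works for herb.model, coral.model, and pisc.model, which makes it easier to keep the numbers in the written summary in step with the model output.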

Output:

library(ggplot2)
library(jtools)
library(ggstance)
plot_summs(herb.model, scale = TRUE, plot.distributions = TRUE, inner_ci_level = 0.9, colors = "medium turquoise")

Here is a visualization of the herb.model summary showing that slope influences herbivore fish abundance but surface area does not.

plot_summs(pisc.model, scale = TRUE, plot.distributions = TRUE, inner_ci_level = 0.9, colors = "spring green")

Here, neither slope nor surface area influences piscivore fish abundance.

plot_summs(coral.model, scale = TRUE, plot.distributions = TRUE, inner_ci_level = 0.9, colors = "forest green")

Again, no influence on corallivore fish by surface area or slope.

plot_summs(fish.model, scale = TRUE, plot.distributions = TRUE, inner_ci_level = 0.9, colors = "dark olive green")

Corallivore abundance IS influenced by herbivore abundance.

library(ggpubr)
herb.scatter<-ggscatter(fish.complex, x = "Slope", y = "Herbivore", 
          add = "reg.line", conf.int = TRUE, 
          cor.coef = TRUE, cor.method = "pearson",mainlab = NULL,
          xlab = "Average Slope (%)", ylab = "Herbivore Abundance", color = "medium sea green", pch = 3)
herb.scatter

library(plotly)
var_1 <- fish.complex$SA
var_2 <- fish.complex$Slope
var_3 <- fish.complex$Herbivore

Herb3D <- plot_ly(fish.complex,
                  x = var_1,
                  y = var_2,
                  z = var_3) %>%
  add_markers(color = ~Herbivore) %>%
  layout(scene = list(xaxis = list(title = "Average Surface Area"),
                      yaxis = list(title = "Average Slope"),
                      zaxis = list(title = "Herbivore Abundance")))

Herb3D
coral.scatter<-ggscatter(fish.complex, x = "Herbivore", y = "Corallivore", 
          add = "reg.line", conf.int = TRUE, 
          cor.coef = TRUE, cor.method = "pearson",mainlab = NULL,
          xlab = "Herbivore Abundance", ylab = "Corallivore Abundance", color = "medium orchid", pch = 3)
coral.scatter

Summary:

In 2018, across the 9 West Hawaii sites analyzed here, herbivore fish abundance was influenced by average slope (p = 0.03), but not by average surface area (p = 0.10) of the benthos. Based on the adjusted R-squared, 54.6% of the variation in herbivore abundance can be explained by the model. Corallivores were not influenced by either average surface area (p = 0.24) or average slope (p = 0.16), with 9.3% of the variation in their abundance being explained by the model. Piscivore fish were also found to not be influenced by either habitat complexity metric (surface area, p = 0.89; slope, p = 0.70). The piscivore model explains essentially none of the variation in piscivore abundance: its multiple R-squared is 0.09 and its adjusted R-squared is negative (-0.21). Corallivore fish abundance was found to be influenced by herbivore fish abundance (p = 0.03) at these West Hawaii sites in 2018, with 43.7% of the variation in their abundance being explained by the model.
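The adjusted R-squared values reported by `summary()` come from the multiple R-squared, the number of sites (n), and the number of predictors (p): adjusted R-squared = 1 - (1 - R-squared) * (n - 1) / (n - p - 1). With only 9 sites the penalty is large, which is how a model can end up with a negative value. A minimal sketch, where `adj_r2()` is a helper written for this illustration:

```r
# Adjusted R-squared from multiple R-squared, sample size n, and p predictors
adj_r2 <- function(r2, n, p) {
  1 - (1 - r2) * (n - 1) / (n - p - 1)
}

# Multiple R-squared values taken from the model summaries above (n = 9 sites)
herb_adj  <- adj_r2(0.6598,  n = 9, p = 2)  # herbivores ~ SA + Slope
coral_adj <- adj_r2(0.3199,  n = 9, p = 2)  # corallivores ~ SA + Slope
pisc_adj  <- adj_r2(0.09129, n = 9, p = 2)  # piscivores ~ SA + Slope
fish_adj  <- adj_r2(0.5078,  n = 9, p = 1)  # corallivores ~ herbivores

round(c(herb = herb_adj, coral = coral_adj, pisc = pisc_adj, fish = fish_adj), 3)
```

A negative adjusted R-squared, as in the piscivore model, should be read as "no better than a model with no predictors" rather than as a percentage of variation explained.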

Chong-Seng et al. (2012) broke down fish functional groups further than just herbivores, piscivores, and corallivores, but had similar findings in that herbivores were slightly influenced by habitat complexity and piscivores were not. Since piscivores are predators, it makes sense that habitat complexity does not influence their abundance. Our findings differ for corallivores: their abundance was influenced by habitat complexity in Chong-Seng et al.'s study, but not in mine. This study was a snapshot of one year, looking at a single measure of three functional groups against two habitat complexity measures at only 9 sites. To get a better understanding of the interplay between fish abundance and habitat complexity, more data with more variables are needed to widen the scope.

Literature Cited:

Burns JHR, Delparte D, Gates RD, Takabayashi M. 2015. Integrating structure-from-motion photogrammetry with geospatial software as a novel technique for quantifying 3D ecological characteristics of coral reefs. PeerJ 3:e1077. https://doi.org/10.7717/peerj.1077

Chong-Seng KM, Mannering TD, Pratchett MS, Bellwood DR, Graham NAJ. 2012. The influence of coral reef benthic condition on associated fish assemblages. PLoS ONE 7(8):e42167. https://doi.org/10.1371/journal.pone.0042167

Friedlander AM, Brown EK, Jokiel PL, Smith WR, Rodgers KS. 2003. Effects of habitat, wave exposure, and marine protected area status on coral reef fish assemblages in the Hawaiian archipelago. Coral Reefs 22:291-305.

Wedding LM, Friedlander AM, McGranaghan M, Yost RS, Monaco ME. 2008. Using bathymetric lidar to define nearshore benthic habitat complexity: Implications for management of reef fish assemblages in Hawaii. Remote Sensing of Environment 112:4159-4165.